Skip to content

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Feb 29, 2016

This doesn't work, because out of some reason in_macro does not trigger for the for-loops in for_loop_over_option_and_result in compile-fail/for_loops.rs. My visit_name code might be wrong, because I'm not sure what happens when a hir::Ident is visited (are both the hygienic and non-hygienic name visited?), but even then should the span + in_macro prevent this.

Help? 🆘

cc #644

@mcarton
Copy link
Contributor

mcarton commented Feb 29, 2016

Your problem is not because of macros but because of the for loop lowering, which does not change the expn_id so in_macro ignores it.

@mcarton
Copy link
Contributor

mcarton commented Feb 29, 2016

Also I’m not sure about that lint being Warn by default, it would lint foo_x and foo_y but not x and y and it would lint for item in items. Looks like a lot of false positive.

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 29, 2016

Also I’m not sure about that lint being Warn by default, it would lint foo_x and foo_y but not x and y and it would lint for item in items. Looks like a lot of false positive.

sgtm

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 29, 2016

Your problem is not because of macros but because of the for loop lowering, which does not change the expn_id so in_macro ignores it.

http://manishearth.github.io/rust-internals-docs/syntax/codemap/struct.ExpnInfo.html suggests that if let also changes the expn_id... so it's a bug that for doesn't change it?

@mcarton
Copy link
Contributor

mcarton commented Feb 29, 2016

I don’t think that’s what it suggest (ok, let’s say the doc is not up-to-date 😄, NameAndSpan clearly shows that only macros are supposed to have an ExpnInfo). The span for the ident that causes your error is the span of the entire for loop because the ident comes from the loop desugaring, but the expn_id of that span is not changed. Looking at the source, it looks like this is intentional and consistent with if let and for desugaring.

@mcarton
Copy link
Contributor

mcarton commented Feb 29, 2016

BTW, why don’t you make that an EarlyLint? This happens before desugaring.

@oli-obk oli-obk changed the title WIP: lint on binding-names that are too similar lint on binding-names that are too similar Mar 1, 2016
@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 1, 2016

made it an EarlyLint and got rid of false positives by going through the following list:

  1. white-listed lhs and rhs
  2. white-listed everything that begins with a whitelist element + underscore
  3. white-listed everything that ends with an underscore + whitelist element
  4. allow everything that only differs in the first char, and the first char is separated from the rest with an underscore
  5. allow everything that only differs in the last char, and the last char is separated from the rest with an underscore

@@ -90,12 +90,12 @@ impl Constant {
impl PartialEq for Constant {
fn eq(&self, other: &Constant) -> bool {
match (self, other) {
(&Constant::Str(ref ls, ref lsty), &Constant::Str(ref rs, ref rsty)) => ls == rs && lsty == rsty,
(&Constant::Str(ref ls, ref l_sty), &Constant::Str(ref rs, ref r_sty)) => ls == rs && l_sty == r_sty,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think there is any value added by all those _, IMO it’s as easy to distinguish between l and r as between l_ and r_. The lint seems more annoying than usefull :s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xD i thought so, too. But once I added the underscore, I found it much better. If it were a type, you wouldn't name it Rsty but you'd name it RSty. This is the snake-case variant of that.

@mcarton
Copy link
Contributor

mcarton commented Mar 1, 2016

Ok, so let’s see how annoying is your lint in the wild 😄:

  • It triggered a lot in clippy itself.
  • On the regex crate, your lint triggers exacly once:
src/compile.rs:629:31: 629:36 error: binding's name is too similar to existing binding
src/compile.rs:629             MaybeInst::Split2(goto2) => {
                                                 ^~~~~
src/lib.rs:14:9: 14:22 note: lint level defined here
src/lib.rs:14 #![deny(similar_names)]
                      ^~~~~~~~~~~~~
src/compile.rs:626:31: 626:36 note: existing binding defined here
src/compile.rs:626             MaybeInst::Split1(goto1) => {
                                                 ^~~~~
src/compile.rs:629:31: 629:36 help: separate the discriminating character by an underscore like: `goto_2`
src/compile.rs:629             MaybeInst::Split2(goto2) => {
                                                 ^~~~~
src/compile.rs:629:31: 629:36 help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#similar_names
  • On the regex-syntax crate twice:
src/lib.rs:342:35: 342:41 error: binding's name is too similar to existing binding
src/lib.rs:342                  Literal { chars: chars2, casei: casei2 }) => {
                                                 ^~~~~~
src/lib.rs:4:9: 4:22 note: lint level defined here
src/lib.rs:4 #![deny(similar_names)]
                     ^~~~~~~~~~~~~
src/lib.rs:341:44: 341:50 note: existing binding defined here
src/lib.rs:341                 (Some(Literal { chars: mut chars1, casei: casei1 }),
                                                          ^~~~~~
src/lib.rs:342:35: 342:41 help: separate the discriminating character by an underscore like: `chars_2`
src/lib.rs:342                  Literal { chars: chars2, casei: casei2 }) => {
                                                 ^~~~~~
src/lib.rs:342:35: 342:41 help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#similar_names
src/lib.rs:342:50: 342:56 error: binding's name is too similar to existing binding
src/lib.rs:342                  Literal { chars: chars2, casei: casei2 }) => {
                                                                ^~~~~~
src/lib.rs:4:9: 4:22 note: lint level defined here
src/lib.rs:4 #![deny(similar_names)]
                     ^~~~~~~~~~~~~
src/lib.rs:341:59: 341:65 note: existing binding defined here
src/lib.rs:341                 (Some(Literal { chars: mut chars1, casei: casei1 }),
                                                                         ^~~~~~
src/lib.rs:342:50: 342:56 help: separate the discriminating character by an underscore like: `casei_2`
src/lib.rs:342                  Literal { chars: chars2, casei: casei2 }) => {
                                                                ^~~~~~
src/lib.rs:342:50: 342:56 help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#similar_names
  • It does not trigger in regex_macros, toml, clap, serde.

At least it does not seem to lint too often as I feared.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 1, 2016

numbering binding names isn't a great way to differentiate names. Separating out the discriminating character with an underscore is much clearer imo. Especially in cases like lfident which is supposed to be the "left" "field" "identifier", which is 3 words and should be either left_field_identifier or l_f_ident short.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 1, 2016

but I think the warnings you are getting are a bug in my code anyway since they seem to be in different arms. I thought I was just checking whether a single scope ever had collisions.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 1, 2016

whoops. The issue was that scoping didn't work if it was inside the initialization expression of a let binding (like in the regex crate). That is fixed now. Doesn't change anything in the regex-syntax crate because there's a chars1 and a chars2 binding in the same arm.

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 8, 2016

rebased

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 11, 2016

ping?

@Manishearth
Copy link
Member

r? @mcarton

@Manishearth
Copy link
Member

Overall this is good, but I'd like @mcarton to sign off on it since he was the one reviewing.

I wonder how efficient this is, since we're running levenshtein across every combination of two names in a scope. Can someone run a before/after -Z time-passes on a largish (but not huge) crate?

@mcarton
Copy link
Contributor

mcarton commented Mar 11, 2016

@Manishearth on Cargo:

Without Clippy:
time: 0.019; rss: 96MB  early lint checks
time: 0.020; rss: 101MB early lint checks
time: 0.020; rss: 100MB early lint checks
time: 0.019; rss: 102MB early lint checks
time: 0.019; rss: 101MB early lint checks
time: 0.020; rss: 102MB early lint checks
time: 0.019; rss: 101MB early lint checks

With Manishearth:master:
time: 0.047; rss: 100MB early lint checks
time: 0.046; rss: 98MB  early lint checks
time: 0.045; rss: 98MB  early lint checks
time: 0.050; rss: 98MB  early lint checks
time: 0.045; rss: 98MB  early lint checks
time: 0.047; rss: 98MB  early lint checks
time: 0.048; rss: 98MB  early lint checks
time: 0.046; rss: 98MB  early lint checks
time: 0.046; rss: 98MB  early lint checks
time: 0.048; rss: 101MB early lint checks

With oli-obk:similar_names:
time: 0.154; rss: 100MB early lint checks
time: 0.162; rss: 102MB early lint checks
time: 0.157; rss: 103MB early lint checks
time: 0.152; rss: 102MB early lint checks
time: 0.155; rss: 98MB  early lint checks
time: 0.158; rss: 98MB  early lint checks
time: 0.158; rss: 98MB  early lint checks
time: 0.160; rss: 98MB  early lint checks
time: 0.157; rss: 98MB  early lint checks
time: 0.163; rss: 98MB  early lint checks

The impact seems important.

span,
&format!("{}th binding whose name is just one char",
self.0.single_char_names.len()));
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if foo { return; } bar(); return; sequence is a little weird here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, that should be if foo { a } else { b } return; iirc we have a lint that detects similar cases, I'll see if I can improve it.

@Manishearth
Copy link
Member

Oh, that's a significant impact :/

There probably are specialized datastructures for handling this but that might lead to a lot of churn.

What if we avoided levenshtein and instead did a simple similarity iteration?

@Manishearth
Copy link
Member

Actually, we don't need to calculate the levenshtein, we only need to find out if the distance is 1. Calculating the levenshtein is O(n^2), but determining if it's a given distance is O(n). Though for small n this may not matter much. shrug

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 14, 2016

for reference, the timings on my machine

just rustc
time: 0.019; rss: 99MB  early lint checks
time: 0.018; rss: 98MB  early lint checks
time: 0.018; rss: 99MB  early lint checks

debug clippy
time: 0.050; rss: 100MB early lint checks

debug clippy + similar_names
time: 0.145; rss: 101MB early lint checks

debug clippy + similar_names + early abort for names whose length differs by more than 1 char
time: 0.103; rss: 100MB early lint checks

debug clippy + similar_names + refactoring to not use levensthein
time: 0.081; rss: 101MB early lint checks

release clippy
time: 0.035; rss: 101MB early lint checks
time: 0.033; rss: 100MB early lint checks
time: 0.034; rss: 101MB early lint checks
time: 0.034; rss: 101MB early lint checks

release clippy + similar_names
time: 0.043; rss: 101MB early lint checks
time: 0.040; rss: 101MB early lint checks
time: 0.041; rss: 100MB early lint checks
time: 0.039; rss: 101MB early lint checks

release clippy + similar_names + refactoring to not use levensthein
time: 0.040; rss: 101MB early lint checks
time: 0.038; rss: 100MB early lint checks
time: 0.039; rss: 101MB early lint checks
time: 0.040; rss: 101MB early lint checks
time: 0.041; rss: 101MB early lint checks
time: 0.039; rss: 101MB early lint checks

Note that with release builds there's no(t much) difference between my two versions.

@@ -200,6 +201,9 @@ pub fn plugin_registrar(reg: &mut Registry) {
reg.register_late_lint_pass(box types::CharLitAsU8);
reg.register_late_lint_pass(box print::PrintLint);
reg.register_late_lint_pass(box vec::UselessVec);
reg.register_early_lint_pass(box non_expressive_names::NonExpressiveNames {
max_single_char_names: 5,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that #698 has been merged, this can be configured. Just sayin' 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 14, 2016

addressed comments

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 14, 2016

done

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 17, 2016

rebased

@oli-obk
Copy link
Contributor Author

oli-obk commented Mar 18, 2016

rebased

//~| NOTE: lint level defined here
//~| NOTE: lint level defined here
//~| NOTE: lint level defined here
//~| NOTE: lint level defined here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s weird, we never had to include that before. Must be because of the last upgrade of compiletest, but why is it needed only here?

@mcarton
Copy link
Contributor

mcarton commented Mar 18, 2016

I’m ok with that, @Manishearth?

Manishearth added a commit that referenced this pull request Mar 21, 2016
lint on binding-names that are too similar
@Manishearth Manishearth merged commit 523c596 into rust-lang:master Mar 21, 2016
@oli-obk oli-obk deleted the similar_names branch March 21, 2016 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants